Increasing the Instruction Fetch Rate via Block-Structured Instruction Set Architectures - Microarchitecture, 1996., IEEE/ACM International Symposium on
نویسندگان
چکیده
To exploit larger amounts of instruction level parallelism, processors are being built with wider issue widths and larger numbers offunctional units. Instruction fetch rate must also be increased in order to effectively exploit the performance potential of such processors. Block-structured ISAs provide an effective means of increasing the instruction fetch rate. We define an optimization, called block enlargement, that can be applied to a block-structured ISA to increase the instruction fetch rate of a processor that implements that ISA. We have constructed a compiler that generates block-structured ISA code, and a simulator that models the execution of that code on a block-structured ISA processox We show that for the SPECint95 benchmarks, the blockstructured ISA processor executing enlarged atomic blocks outperforms a conventional ISA processor by 12% while using simpler microarchitectural mechanisms to support wideissue and dynamic scheduling.
منابع مشابه
A Trace Cache Microarchitecture and Evaluation
As the instruction issue width of superscalar processors increases, instruction fetch bandwidth requirements will also increase. It will eventually become necessary to fetch multiple basic blocks per clock cycle. Conventional instruction caches hinder this effort because long instruction sequences are not always in contiguous cache locations. Trace caches overcome this limitation by caching tra...
متن کاملCode Compression for VLIW Processors
Code compression is an important issue in the design of an embedded system, since memory has been one of the most restricted resources. Most of the previous work in code compression has targeted RISC architectures, although VLIW processors have gained a lot of popularity recently. In this research, we explore methods to the problem of compressing code for VLIW processors. Previous code compress...
متن کاملOptimizations Enabled by a Decoupled Front-End Architecture
ÐIn the pursuit of instruction-level parallelism, significant demands are placed on a processor's instruction delivery mechanism. Delivering the performance necessary to meet future processor execution targets requires that the performance of the instruction delivery mechanism scale with the execution core. Attaining these targets is a challenging task due to I-cache misses, branch mispredictio...
متن کاملAppears in the 36 th International Symposium on Microarchitecture (MICRO-36 2003)
Silicon technology will continue to provide an exponential increase in the availability of raw transistors. Effectively translating this resource into application performance, however, is an open challenge. Ever increasing wire-delay relative to switching speed and the exponential cost of circuit complexity make simply scaling up existing processor designs futile. In this paper, we present an a...
متن کاملTrace Cache Performance
Instruction fetch mechanism is a performance bottleneck of a Superscalar Processor. Fetch performance can be improved with the aid of an instruction memory known as a Trace Cache. This paper presents analytical expressions, which describe instruction fetch performance of a Trace Cache microarchitecture. The instruction fetch rates predicted by the expressions differ by seven percent from the si...
متن کامل